FAQ: base R

Things to cover

Preliminaries

Load Martin 1992 data

library(compbio4all)
data("martin1995")
martin <- martin1995

Clean data

Change the “non-excavating” group to “2ndary cav.” (secondary cavity)

martin$group <- gsub("non-excavating","2ndary cav.",martin$group)

Change the “ground-nesting” group to “ground”

martin$group <- gsub("ground-nesting","ground",martin$group)

Make some more changes to shorten the group names

martin$group <- gsub("Subcanopy or canopy","subcan-/canopy",martin$group)

martin$group <- gsub("shrub or low foliage","shrub",martin$group)

martin$group <- gsub("excavating","excav",martin$group)
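One thing to watch for with gsub() is that a pattern matches anywhere inside a string, so “excavating” also matches inside “non-excavating”. In the cleaning code above this is not a problem, because “non-excavating” has already been renamed by the time “excavating” is replaced. The sketch below uses a small made-up vector (nest is a hypothetical name, not a column in martin) to show the effect and one way to guard against it with the regular-expression anchors ^ and $.

```r
# Toy vector standing in for martin$group (made-up values)
nest <- c("non-excavating", "excavating", "ground-nesting")

# An unanchored pattern also changes "non-excavating"
unanchored <- gsub("excavating", "excav", nest)
unanchored

# Anchoring with ^ (start) and $ (end) changes only exact matches
anchored <- gsub("^excavating$", "excav", nest)
anchored
```

If the replacements were run in a different order, the anchored version would still leave “non-excavating” untouched.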

Classify general latitudinal region as tropical, boreal, or temperate

martin$trop.or.temp <- "temperate"
i.tropics <- which(martin$lat < 23)
i.boreal <- which(martin$lat > 54)

martin$trop.or.temp[i.tropics] <- "tropics"
martin$trop.or.temp[i.boreal] <- "boreal"
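A side effect of the approach above is that any species with a missing latitude would keep the default label “temperate”, because which() silently drops NA. I have not checked whether martin$lat actually contains missing values, so treat this as a precaution. A minimal sketch of an alternative using nested ifelse(), on a made-up latitude vector (lat and region are hypothetical names):

```r
# Made-up latitudes, including one missing value
lat <- c(10, 23, 40, 54, 60, NA)

# Same cutoffs as above: < 23 is tropics, > 54 is boreal, otherwise temperate
region <- ifelse(lat < 23, "tropics",
                 ifelse(lat > 54, "boreal", "temperate"))
region

# A missing latitude stays NA instead of being labelled "temperate"
table(region, useNA = "ifany")
```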

R Plotting FAQ

Questions

T1) How do I make simple tables in R?
T2) How do I make 2-way tables in R?
T3) How do I make 3-way tables in R?

Tables

Our object martin has a column, group, giving the general location where each species in the dataframe nests, and another column, foraging.site, for where the birds forage.

T1) How do I make simple 1-way tables in R?

We can make simple tables for where birds nest, where they forage, and their general latitudinal region like this:

#Table for nest location
table(martin$group)
## 
##    2ndary cav.          excav         ground          shrub subcan-/canopy 
##             18             11             26             49             18
#Table for forage location
table(martin$foraging.site)
## 
##          Aerial Aquatic    Bark  Canopy  Ground   Shrub 
##       1      15       1      13      24      38      30
#Table for general latitudinal location
table(martin$trop.or.temp)
## 
##    boreal temperate   tropics 
##         3       118         1
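The unlabelled column in the foraging-site table (count of 1) looks like a species whose foraging site was recorded as an empty string; that is my reading of the output, not something stated in the data documentation. If so, one option is to recode empty strings as NA and ask table() to report missing values explicitly. A small sketch on a made-up vector (site is a hypothetical name):

```r
# Made-up foraging sites; one entry is an empty string
site <- c("Ground", "", "Shrub", "Ground")

# Recode empty strings as missing values
site[site == ""] <- NA

# useNA = "ifany" adds an <NA> column only when missing values exist
tab <- table(site, useNA = "ifany")
tab
```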

A nice command to learn to use in R is with(). Here’s what it can do when making a table. If you were to read this out loud as if you were telling R what to do, it would be something like this: “With the dataframe called martin, make a table from the group column.” This isn’t that helpful in this situation, but for 2-way tables and other tasks it is nice.

#Making a table with with()
with(martin, table(group))
## group
##    2ndary cav.          excav         ground          shrub subcan-/canopy 
##             18             11             26             49             18

An aside on tables: The summary() command can also give you simple 1-way tables. However, the table() command will make a table out of whatever you give it, while summary() is more generic and changes its output depending on the type of data. For example, the way that these particular data have been loaded, R is treating information that is in text form, like nesting location, as character data. So when I use summary(), I get this:

summary(martin$group)
##    Length     Class      Mode 
##       122 character character

R is very picky about different kinds of data. In order for R to make a table from this column using the summary() command, you need to tell it that the values belong to different categories, that is, different levels of a “factor” variable. This is done with the factor() command. Here, I’ll tell R that the group column is factor data and then make a table with summary().

#Change character data to a factor
martin$group <- factor(martin$group)

#Now I get a table from summary()
summary(martin$group)
##    2ndary cav.          excav         ground          shrub subcan-/canopy 
##             18             11             26             49             18

T2) How do I make simple 2-way tables in R?

We can cross-classify the nesting groups and foraging groups very easily; that is, we can make a 2-way table.

Note that you have to be explicit about where each piece of data comes from by putting “martin$” in front of both columns.

#With nesting location in rows
table(martin$group, martin$foraging.site)
##                 
##                     Aerial Aquatic Bark Canopy Ground Shrub
##   2ndary cav.     0      7       1    2      6      1     1
##   excav           0      0       0   10      1      0     0
##   ground          1      0       0    1      3     18     3
##   shrub           0      5       0    0      1     19    24
##   subcan-/canopy  0      3       0    0     13      0     2
#With nesting location in columns 
table(martin$foraging.site, martin$group)
##          
##           2ndary cav. excav ground shrub subcan-/canopy
##                     0     0      1     0              0
##   Aerial            7     0      0     5              3
##   Aquatic           1     0      0     0              0
##   Bark              2    10      1     0              0
##   Canopy            6     1      3     1             13
##   Ground            1     0     18    19              0
##   Shrub             1     0      3    24              2

You can save yourself some typing by using the with() command. If you read this out loud it would be “with the martin dataframe, make a 2-way table from the group and foraging site columns”.

with(martin, table(group, foraging.site))
##                 foraging.site
## group               Aerial Aquatic Bark Canopy Ground Shrub
##   2ndary cav.     0      7       1    2      6      1     1
##   excav           0      0       0   10      1      0     0
##   ground          1      0       0    1      3     18     3
##   shrub           0      5       0    0      1     19    24
##   subcan-/canopy  0      3       0    0     13      0     2
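Two follow-ups that often come up with 2-way tables are row and column totals and proportions. Base R has addmargins() and prop.table() for this. The sketch below uses a tiny made-up dataframe (toy is hypothetical; its column names simply mirror those in martin) so that the numbers are easy to check by eye.

```r
# Made-up data with the same column names as martin
toy <- data.frame(
  group         = c("ground", "ground", "shrub", "shrub", "shrub"),
  foraging.site = c("Ground", "Shrub", "Shrub", "Shrub", "Ground")
)

tab <- with(toy, table(group, foraging.site))

# Add row and column totals (labelled "Sum")
with_totals <- addmargins(tab)
with_totals

# Proportions within each row (margin = 1); each row sums to 1
row_props <- prop.table(tab, margin = 1)
row_props
```

Using margin = 2 instead would give proportions within each column.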

T3) How do I make 3-way tables in R?

You can cross-classify categorical data as many ways as you’d like. Let’s make a 3-way classification based on nest type, foraging site, and latitude.

table(martin$group, martin$foraging.site, martin$trop.or.temp)
## , ,  = boreal
## 
##                 
##                     Aerial Aquatic Bark Canopy Ground Shrub
##   2ndary cav.     0      0       0    0      0      0     0
##   excav           0      0       0    0      0      0     0
##   ground          0      0       0    0      0      3     0
##   shrub           0      0       0    0      0      0     0
##   subcan-/canopy  0      0       0    0      0      0     0
## 
## , ,  = temperate
## 
##                 
##                     Aerial Aquatic Bark Canopy Ground Shrub
##   2ndary cav.     0      7       1    2      6      1     1
##   excav           0      0       0   10      1      0     0
##   ground          0      0       0    1      3     15     3
##   shrub           0      5       0    0      1     19    24
##   subcan-/canopy  0      3       0    0     13      0     2
## 
## , ,  = tropics
## 
##                 
##                     Aerial Aquatic Bark Canopy Ground Shrub
##   2ndary cav.     0      0       0    0      0      0     0
##   excav           0      0       0    0      0      0     0
##   ground          1      0       0    0      0      0     0
##   shrub           0      0       0    0      0      0     0
##   subcan-/canopy  0      0       0    0      0      0     0

This can be simplified a bit using with()

with(martin, table(group, foraging.site, trop.or.temp))
## , , trop.or.temp = boreal
## 
##                 foraging.site
## group               Aerial Aquatic Bark Canopy Ground Shrub
##   2ndary cav.     0      0       0    0      0      0     0
##   excav           0      0       0    0      0      0     0
##   ground          0      0       0    0      0      3     0
##   shrub           0      0       0    0      0      0     0
##   subcan-/canopy  0      0       0    0      0      0     0
## 
## , , trop.or.temp = temperate
## 
##                 foraging.site
## group               Aerial Aquatic Bark Canopy Ground Shrub
##   2ndary cav.     0      7       1    2      6      1     1
##   excav           0      0       0   10      1      0     0
##   ground          0      0       0    1      3     15     3
##   shrub           0      5       0    0      1     19    24
##   subcan-/canopy  0      3       0    0     13      0     2
## 
## , , trop.or.temp = tropics
## 
##                 foraging.site
## group               Aerial Aquatic Bark Canopy Ground Shrub
##   2ndary cav.     0      0       0    0      0      0     0
##   excav           0      0       0    0      0      0     0
##   ground          1      0       0    0      0      0     0
##   shrub           0      0       0    0      0      0     0
##   subcan-/canopy  0      0       0    0      0      0     0
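Three-way tables printed as a stack of 2-way slices can be hard to scan. Base R’s ftable() “flattens” the same counts into a single compact table. A minimal sketch on made-up data (toy is hypothetical; the column names mirror martin):

```r
# Made-up data with the same three columns used above
toy <- data.frame(
  group         = c("ground", "ground", "shrub", "shrub", "shrub", "excav"),
  foraging.site = c("Ground", "Ground", "Shrub", "Ground", "Shrub", "Bark"),
  trop.or.temp  = c("temperate", "boreal", "temperate",
                    "temperate", "tropics", "temperate")
)

# Ordinary 3-way table
tab3 <- with(toy, table(group, foraging.site, trop.or.temp))

# Flattened version: the last variable becomes the columns
flat <- ftable(tab3)
flat
```

With the real data the equivalent call would presumably be ftable(with(martin, table(group, foraging.site, trop.or.temp))).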

BP1) How do I make histograms in R?

The basics: R makes histograms in a snap, something that I do not know of an efficient way to do in Excel. Let’s make a histogram of clutch size (number of eggs laid per nest) across the different species. Since I’ve forgotten exactly what I called this column, I’ll first check the column names using the names() function, then make a histogram with hist(). R automatically “bins” the data and plots the frequency of observations in each bin along the y-axis.

#Look at the names in the martin dataframe
names(martin)
##  [1] "i.bird"        "group"         "spp"           "clutch.sz"    
##  [5] "inc.dur"       "nest.dur"      "nest.suc"      "pred"         
##  [9] "broods"        "surv.adult"    "lat"           "foraging.site"
## [13] "refs"          "trop.or.temp"
#Make a histogram of clutch size
hist(martin$clutch.sz)

Finesse: Changing titles. R automatically assigns names to everything. Here I change the x-axis label (xlab) and the title, which is called the “main title” (main).

hist(martin$clutch.sz, 
     xlab = "Clutch size", 
     main = "Martin's clutch-size data")

Finesse: Adding a vertical line. The bird I study typically lays 5 eggs. To make it easy to compare my bird to other birds, I can add a vertical line at 5. This requires two separate commands: first I make the plot, then I add the line. The line is made with the abline() command, which stands for “A-B line”, where A stands for the intercept of a line and B stands for its slope. abline() can also make horizontal lines (h =) and vertical lines (v =), but the programmers must have thought that abhvline() was too long of a name.

#Make histogram
hist(martin$clutch.sz)

#Add vertical line at 5
abline(v = 5)

The line is black, so it blends in. I can change the color using the col = argument within abline(). In the default palette, col = 2 is red.

#Make histogram
hist(martin$clutch.sz)
abline(v = 5, col = 2)
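hist() picks the bins for you, but because clutch size is close to a count it can be clearer to centre one bin on each whole number using the breaks argument. The sketch below uses a made-up vector (clutch is hypothetical, not the real martin$clutch.sz) and stores the histogram first so the bin counts can be inspected.

```r
# Made-up clutch sizes, including one missing value
clutch <- c(2, 3, 3, 4, 4, 4, 5, 5, 6, NA)

# Bin edges at 1.5, 2.5, ..., 6.5 so each bin is centred on a whole number
h <- hist(clutch, breaks = seq(1.5, 6.5, by = 1), plot = FALSE)

# Counts per bin; hist() drops the missing value
h$counts

# Draw the stored histogram and mark 5 eggs in red
plot(h, xlab = "Clutch size", main = "Toy clutch-size data")
abline(v = 5, col = 2)
```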

BP2) How do I make boxplots in R?

Oneliner:

boxplot(martin$clutch.sz ~ martin$group)

Full code:

par(mfrow = c(1,1))
boxplot(clutch.sz ~ group, data = martin,
        varwidth = TRUE, #box widths scale with the square root of sample size
        notch = FALSE,   #notches can be used to judge whether medians
                         #differ; they work best with relatively large
                         #sample sizes
        col = 2:7)       #colors

Details: Boxplots are recommended by many statisticians as an excellent way to summarize the distribution of data. Boxplots are also a breeze in R, while they are awkward in Excel (I’ve seen instructions for them but it looks like a pain). Let’s make a boxplot comparing clutch size among birds from different nesting locations. The boxplot() function uses a tilde, ~, in its formula. As in many R functions, such as those for regression, the response variable (here continuous) goes to the left of ~ and the grouping (categorical) variable goes to the right.

boxplot(martin$clutch.sz ~ martin$group)

BP3) How do I make scatterplots in R?

Oneliner:

plot(clutch.sz ~ nest.dur, data = martin)

Full code:

#Plot
plot(clutch.sz ~ nest.dur, data = martin,
     main = "Main Title: Clutch size vs. nesting duration",
     sub = "Subtitle: Martin 199x Ecological Monographs",
     xlab = "xlab: nesting duration",
     ylab = "ylab: clutch size",
     col = 2,   #change the color
     pch = 2)   #change the shape of the symbol

#Regression line
abline(a = 3.6,   #intercept
       b = 0.045) #slope

Details: Scatterplots are more or less the default plot type of R. There are several minor variations that all do the same thing. The first is probably the easiest to remember because it is similar to the standard format for many other R commands, such as linear regression.

#Using a tilde ~ and "data ="
plot(clutch.sz ~ nest.dur, data = martin)

#Using a tilde and explicit column references
plot(martin$clutch.sz ~ martin$nest.dur)

#Using commas and explicit column references.  Note that the order is flipped: x comes first, then y.
plot(martin$nest.dur, martin$clutch.sz)

BP4) How do I put a regression line through a scatterplot in R?

Oneliner:

plot(clutch.sz ~ nest.dur, data = martin)
abline(a = 3.6, b = 0.045)

Details
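The intercept and slope above are typed in by hand. An alternative is to fit the regression with lm() and pass the fitted model straight to abline(), which then reads the coefficients for you. The sketch below uses a made-up dataframe (toy is hypothetical) built so that the points fall exactly on a line with intercept 3 and slope 0.05, which makes the result easy to check. With the real data the calls would presumably be lm(clutch.sz ~ nest.dur, data = martin) followed by abline(fit).

```r
# Made-up data lying exactly on the line y = 3 + 0.05 * x
toy <- data.frame(nest.dur = c(10, 15, 20, 25, 30))
toy$clutch.sz <- 3 + 0.05 * toy$nest.dur

# Fit the regression; coef() returns the intercept and slope
fit <- lm(clutch.sz ~ nest.dur, data = toy)
coef(fit)

# Plot the points, then let abline() take a and b from the model
plot(clutch.sz ~ nest.dur, data = toy)
abline(fit, col = 2)
```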

BP5) How do I put 2 plots next to each other in R?

Oneliner:

par(mfrow = c(1,2))
plot(clutch.sz ~ nest.dur, data = martin)
plot(clutch.sz ~ surv.adult, data = martin)

Details: R can relatively easily format plots next to each other, but the function is rather cryptic, involving the par() command with the “mfrow =” argument. To plot two figures next to each other use ‘par(mfrow = c(1,2))’, which basically says “put the plots in a grid with 1 row and 2 columns.” The default setting for R is par(mfrow = c(1,1)), which corresponds to a 1 x 1 grid. You can probably guess that to put two plots on top of each other you would use par(mfrow = c(2,1)), and so forth.

#Two plots next to each other
par(mfrow = c(1,2))
plot(clutch.sz ~ nest.dur, data = martin)
plot(clutch.sz ~ surv.adult, data = martin)


#Two plots on top of each other
par(mfrow = c(2,1))
plot(clutch.sz ~ nest.dur, data = martin)
plot(clutch.sz ~ surv.adult, data = martin)
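A setting made with par() stays in effect for later plots until it is changed back. A common idiom is to save the old settings when changing them and restore them afterwards; par() returns the previous values when you set new ones. A minimal sketch, using R’s built-in cars dataset as a stand-in for martin:

```r
# par() returns the previous settings, so save them while changing mfrow
old_par <- par(mfrow = c(1, 2))

# Two plots side by side (built-in 'cars' data used as a stand-in)
plot(dist ~ speed, data = cars)
hist(cars$dist)

# Restore the previous layout (a 1 x 1 grid by default)
par(old_par)
par("mfrow")
```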

Finesse: Lots of tweaks can be made to plots in general, and to multiple plots in a grid, such as the space between plots. The code, however, is rather cryptic. If you look at the help page for the par() command you’ll see dozens of options. For quickly making plots look nice, many people just export into PowerPoint and arrange and annotate them there. Learning how to feed commands to par() to make plots and groups of plots look nice can save you time in the long run, but it takes time to figure out what you want and what you like. I find that the R package ggplot2 with its function qplot() makes nicer default plots. ggplot2 has its own complicated syntax, but is preferred by many R users.

How do I make a scatterplot using ggplot?

Oneliner:

library(ggplot2)
qplot(y = pred, x = nest.dur, data = martin)

How do I add a regression line to a scatterplot with ggplot?

Oneliner:

library(ggplot2)
#method is given to geom_smooth() directly; qplot() ignores it in recent ggplot2 versions
qplot(y = pred, x = nest.dur, data = martin) +
  geom_smooth(method = "lm")

Full code:

qplot(y = pred, x = nest.dur, data = martin,
      xlab = "xlab = nesting duration",
      ylab = "ylab = predation rate",
      main = "main title: predation vs. nesting duration") +
  geom_smooth(method = "lm", se = FALSE)
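As far as I know, qplot() has been marked as deprecated in recent ggplot2 releases, so it may be worth knowing the equivalent written with ggplot() and explicit layers. The sketch below is one way to write it, on a made-up dataframe (toy and its values are hypothetical; the column names mirror martin). It assumes ggplot2 is installed.

```r
library(ggplot2)

# Made-up data with the same column names as martin
toy <- data.frame(
  nest.dur = c(18, 20, 22, 25, 27, 30),
  pred     = c(0.30, 0.35, 0.33, 0.45, 0.50, 0.48)
)

# Points plus a straight-line fit without the confidence band
p <- ggplot(toy, aes(x = nest.dur, y = pred)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(x = "Nesting duration",
       y = "Predation rate",
       title = "Predation vs. nesting duration")
p
```

Each + adds one layer or setting, which makes it easier to see where arguments such as method and se belong.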

How do I make a strip chart?

Boxplots are great but don’t work well for small data sets. A strip chart plots every individual data point instead.

#Vertical strip chart
stripchart(martin$clutch.sz ~ factor(martin$group),
           vertical = TRUE)

#Horizontal strip chart
stripchart(martin$clutch.sz ~ factor(martin$group),
           vertical = FALSE)

GGPLOT

library(ggplot2)

How do I increase the font size for the x and y axes in ggplot?

How do I increase the SIZE OF THE TICK LABELS on the x and y axes?

How do I increase the size of TITLES or labels on the x and y axes?

How do I make these things bold?

Use “theme(axis.text = element_text(size = …))” for the tick labels. Use “theme(axis.title = element_text(size = …, face = "bold"))” for the axis titles.

library(ggplot2)

qplot(clutch.sz,
       data = martin) + 
  geom_vline(xintercept = 5) +
  xlab("Clutch size") +
  ylab("Count") + 
  ggtitle("Distribution of clutch sizes, data 1992") +
  theme(axis.text=element_text(size=18),               #increase  font size to 18
        axis.title=element_text(size=22,
                                face="bold")) #use bold text

How do I add horizontal and vertical lines to a ggplot?

geom_hline(yintercept = 0)   #horizontal line: set where it crosses the y-axis
geom_vline(xintercept = 0)   #vertical line: set where it crosses the x-axis

How do I flip or rotate the coordinates so that the y-axis becomes the x-axis in ggplot?

coord_flip()

This can be used to make coefficient plots and forest plots.

How do I flip the y axis so that positive values are on the opposite side in ggplot?

scale_y_reverse()

How do I manually set the colors for the legend or scale in ggplot?

How do I change the colors of the legend or scale in ggplot?

scale_colour_manual(name="Experimental\nSettings",
                      values=colors_BeS.3,
                      labels=c("Greenhouse", "Greenhouse\n& Field")) 

How do I make a confidence band around a line in ggplot?

(regression, linear model, CI, confidence interval)

geom_ribbon(aes(ymin=SE.real.minus,
                             ymax=SE.real.plus),
                         alpha=0.15,
                         linetype = 0)
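The geom_ribbon() fragment above relies on columns (SE.real.minus and SE.real.plus) that come from another analysis and are not defined here. As a self-contained illustration, the sketch below computes a 95% confidence band for a simple linear regression with predict() and draws it with geom_ribbon(). The data are made up (toy is hypothetical), and the band is a confidence interval for the fitted line, which may differ from what the original columns held. It assumes ggplot2 is installed.

```r
library(ggplot2)

# Made-up data with a roughly linear trend
toy <- data.frame(
  x = 1:10,
  y = c(2.1, 2.9, 4.2, 4.8, 6.1, 7.2, 7.8, 9.1, 9.9, 11.2)
)

# Fit the model and get the fitted values with lower/upper confidence limits
fit  <- lm(y ~ x, data = toy)
ci   <- predict(fit, newdata = toy, interval = "confidence")  # columns fit, lwr, upr
band <- cbind(toy, ci)

# Ribbon for the band, line for the fit, points for the raw data
p <- ggplot(band, aes(x = x, y = y)) +
  geom_ribbon(aes(ymin = lwr, ymax = upr), alpha = 0.15, linetype = 0) +
  geom_line(aes(y = fit)) +
  geom_point()
p
```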

How do I set the limits on the x or y axis in ggplot?

How do I re-name the x or y axis in ggplot?

The functions scale_x_continuous() and scale_y_continuous() can do several things, including naming an axis and setting its limits. See also xlab() and ylab().

scale_x_continuous(name="Stem Length",limits=c(15,100)) + #Stem Length (cm)
scale_y_continuous(name="Reversion Probability")

How do I make changes to the legend in ggplot?

How do I change the position of the legend in ggplot?

How do I change the title over the legend?

How do I change the font of the title of the legend in ggplot?

How do I change the font of the labels of the legend in ggplot?

#keywords: ggplot, legend, font, bold
theme(
  legend.position = "right",
  #legend.position = c(0.15, .85),
  legend.title = element_text(colour = "black", 
                              size = 24, 
                              face = "bold"),
  legend.text = element_text(colour = "black", 
                             size = 20, 
                             face = "bold"))

How do I change the thickness of the axes in ggplot?

theme(axis.line = element_line(size=2, color = "black"))    #makes axis lines thicker; newer ggplot2 versions prefer linewidth = 2

How do I get rid of/remove/suppress the box around the plotting field?

# keywords: box, border, edging, around plot
# fill = NA keeps the border from covering the plotting area
theme(panel.border = element_rect(color = "white", fill = NA))

How do I get rid of the box around…

# keywords: box, border, edging, around plot
theme(legend.key = element_rect(colour = 'white'))   # gets rid of box around legend key items

How do I get rid of the grid lines within the plot in ggplot?

 theme(panel.grid.minor=element_blank(), 
       panel.grid.major=element_blank())

How do I set the font of the axes titles?

How do I adjust the positions of the axes titles?

theme(axis.title.x = element_text(face = "bold", size = 32),
      axis.text.x  = element_text(vjust = 0.95, size = 32, face = "bold"),
      axis.title.y = element_text(face = "bold", size = 32, vjust = 0.35),
      axis.text.y  = element_text(vjust = 0.5, size = 32, face = "bold"))

How do I drop, remove or suppress the legend in ggplot?

theme(legend.position = "none")

Remove facet strip completely

http://stackoverflow.com/questions/10547487/r-removing-facet-wrap-labels-completely-in-ggplot2

theme(strip.background = element_blank(),
       strip.text.x = element_blank())

How do I increase the size of the text within facets?

How do I increase the font size of facet labels?

http://stackoverflow.com/questions/2751065/how-can-i-manipulate-the-strip-text-of-facet-plots-in-ggplot2/2751201#2751201

theme(strip.text.x = element_text(size = 8, colour = "orange", angle = 90))

How do I reorder a factor variable for plotting in ggplot?

levs <- mult.comp.df$names.ordered[order.x]
mult.comp.df$names.ordered <- factor(mult.comp.df$names.ordered,
                                     levels = levs)
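The snippet above depends on objects (mult.comp.df and order.x) from another analysis. The general idea is that ggplot and base R plots draw the groups in the order of the factor levels, so reordering the levels reorders the plot. A self-contained sketch on made-up data (toy is hypothetical) using base R’s reorder(), which sorts the levels by a summary of another variable:

```r
# Made-up data: three groups with different typical values
toy <- data.frame(
  group = factor(c("a", "a", "b", "b", "c", "c")),
  value = c(5, 7, 1, 3, 9, 11)
)

# Default level order is alphabetical
levels(toy$group)

# Reorder the levels by the median of value within each group
toy$group <- reorder(toy$group, toy$value, FUN = median)
levels(toy$group)

# Plots now follow the new order, from lowest to highest median
boxplot(value ~ group, data = toy)
```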

Colorblind palettes

Color-blind-friendly palettes:
http://dr-k-lo.blogspot.com/2013/07/a-color-blind-friendly-palette-for-r.html
http://www.cookbook-r.com/Graphs/Colors_%28ggplot2%29/
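Base R also ships a color-blind-friendly palette. The sketch below assumes R version 4.0.0 or later, where palette.colors() offers the “Okabe-Ito” palette, and uses the built-in InsectSprays dataset as a stand-in for martin.

```r
# Six colors from the Okabe-Ito palette (needs R >= 4.0.0)
cols <- palette.colors(n = 6, palette = "Okabe-Ito")
cols

# One color per group in a base R boxplot (built-in data as a stand-in)
boxplot(count ~ spray, data = InsectSprays, col = cols)
```

In ggplot the same vector could be supplied through scale_colour_manual(values = unname(cols)) or scale_fill_manual().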

Plot adult survival versus clutch size using the basic R plotting function

plot(surv.adult ~ clutch.sz, data = martin)

Plot distribution of survival rates

hist(martin$surv.adult)

Plot distribution of clutch sizes

hist(martin$clutch.sz)

Load ggplot package

library(ggplot2)

Plot distribution of clutch sizes in ggplot

qplot(clutch.sz,
       data = martin)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 1 rows containing non-finite values (stat_bin).

Add line at 5 for LOWA

qplot(clutch.sz,
       data = martin) + geom_vline(xintercept = 5)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 1 rows containing non-finite values (stat_bin).

Add labels

qplot(clutch.sz,
       data = martin) + 
  geom_vline(xintercept = 5) +
  xlab("Clutch size") +
  ylab("Count") +
  ggtitle("Distribution of clutch sizes, data 1992")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 1 rows containing non-finite values (stat_bin).

Make background white

qplot(clutch.sz,
       data = martin) + 
  geom_vline(xintercept = 5) +
  xlab("Clutch size") +
  ylab("Count") +
  ggtitle("Distribution of clutch sizes, data 1992") +
  theme(axis.text=element_text(size=18),
        axis.title=element_text(size=22,face="bold")) + 
  theme_bw()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 1 rows containing non-finite values (stat_bin).

Plot as density curve

qplot(clutch.sz,
       data = martin,
      geom = "density") + 
  geom_vline(xintercept = 6) +
  xlab("Clutch size") +
  ylab("Density") +
  ggtitle("Distribution of clutch sizes, data 1992") +
  theme(axis.text=element_text(size=18),
        axis.title=element_text(size=22,face="bold")) + 
  theme_bw()
## Warning: Removed 1 rows containing non-finite values (stat_density).

Plot as separate density curve for each nest type group

qplot(clutch.sz,
       data = martin,
      geom = "density",
      color = group,
      group = group) + 
  geom_vline(xintercept = 6) +
  xlab("Clutch size") +
  ylab("Density") +
  ggtitle("Distribution of clutch sizes, data 1992") +
  theme(axis.text=element_text(size=18),
        axis.title=element_text(size=22,face="bold")) + 
  theme_bw()
## Warning: Removed 1 rows containing non-finite values (stat_density).

Plot distribution of clutch sizes by FORAGING site

qplot(clutch.sz,
       data = martin,
      facets = . ~ foraging.site)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 1 rows containing non-finite values (stat_bin).

Use facet wrap

qplot(clutch.sz,
       data = martin) +
  facet_wrap(~foraging.site)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 1 rows containing non-finite values (stat_bin).

Plot distribution of clutch sizes by habitat group

qplot(clutch.sz,
       data = martin,
      facets = . ~ group)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 1 rows containing non-finite values (stat_bin).

Boxplot

qplot(y = clutch.sz,
      x = group,
      geom = "boxplot",
       data = martin) +
  geom_hline(yintercept = 6)
## Warning: Removed 1 rows containing non-finite values (stat_boxplot).

Incubation period vs. predation

qplot(y = inc.dur,
      x = pred,
      data = martin)
## Warning: Removed 20 rows containing missing values (geom_point).

Plot adult survival versus clutch size in ggplot

qplot(y = surv.adult,
      x = clutch.sz,
      data = martin)
## Warning: Removed 41 rows containing missing values (geom_point).

Plot adult survival versus duration of incubation in ggplot

qplot(y = surv.adult,
      x = inc.dur,
      data = martin)
## Warning: Removed 41 rows containing missing values (geom_point).

Add a trend line

qplot(y = surv.adult,
      x = clutch.sz,
      data = martin,
      geom = c("point","smooth"))
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Warning: Removed 41 rows containing non-finite values (stat_smooth).
## Warning: Removed 41 rows containing missing values (geom_point).

Color code each group

qplot(y = surv.adult,
      x = clutch.sz,
      data = martin,
      color = group)
## Warning: Removed 41 rows containing missing values (geom_point).

Make size of points bigger

qplot(y = surv.adult,
      x = clutch.sz,
      data = martin,
      color = group,
      size = I(4))   #I() sets a fixed size instead of mapping size to a legend
## Warning: Removed 41 rows containing missing values (geom_point).

Add trend line to each group

#method and se belong to geom_smooth(); qplot() ignores them in recent ggplot2 versions
ggplot(martin, aes(y = surv.adult, x = clutch.sz, color = group)) +
  geom_point(size = 4) +
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 41 rows containing non-finite values (stat_smooth).
## Warning: Removed 41 rows containing missing values (geom_point).

Figure out which species have “warbler” in their name

martin$warbler <- "not warbler"
martin$warbler[grep("Warb", martin$spp)] <- "warbler"
qplot(y = clutch.sz,
      x = warbler,
      geom = c("boxplot","point"),
      data = martin) + geom_hline(yintercept = 5)
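A caution about the pattern "Warb": it matches any name containing those four letters, which could include species that are not warblers, for example a Warbling Vireo. I have not checked whether martin$spp contains such a name, so this is a precaution rather than a known problem. The sketch below uses a made-up vector of species names (spp is hypothetical) and grepl(), which returns TRUE or FALSE for each element.

```r
# Made-up species names
spp <- c("Yellow Warbler", "Warbling Vireo", "American Robin")

# The short pattern also matches the vireo
grepl("Warb", spp)

# A more specific pattern matches only names containing "Warbler"
is_warbler <- grepl("Warbler", spp)
warbler <- ifelse(is_warbler, "warbler", "not warbler")
warbler
```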

OLD / Alt

Things to cover

boxplot - base, qplot, ggplot
boxplot ordered
histogram
scatterplot
scatterplot w/vert/hort/1:1
scatterplot with regression lines
scatterplot with spline smoother
scatterplot matrix
